(57) Abstract

The invention relates to a method for generating a large static image M(n), such as a
sprite or a mosaic, from a video sequence. This method comprises, in view of a first
accretion step, an estimation of the motion parameters related to the video objects of the
sequence, with respect to the previously generated static image. Each video object is then
warped based on said parameters, and the warped video objects are blended with the
previously generated static image. The method also comprises (n-1) further accretion steps,
applied this time to the same video sequence considered in the reverse order. Each additional
accretion step itself includes a warping sub-step, based on each successive video object
considered in said reverse order and on the corresponding motion parameters already
estimated, and a blending sub-step, pixel-based or region-based weighting coefficients being
then computed to be taken into account during the blending steps.
Static image generation method and device.



PCT/EP99/05513 



FIELD OF THE INVENTION 

The invention relates to a method for generating a large static image, such as a 
sprite or a mosaic, from a video sequence including successive video objects VOs, said 
method comprising, in view of the accretion of said static image, the steps of : 
(A) estimating motion parameters related to the current video object VO(n) of
the sequence, with respect to the previously generated static image ;

(B) warping said current video object VO(n), on the basis of said estimated
motion parameters ;

(C) blending the warped video object WVO(n) thus obtained with said
previously generated static image ;

and to a corresponding device. This invention may be useful in relation to the MPEG-4 and
MPEG-7 standards.

BACKGROUND OF THE INVENTION 

The descriptors and description schemes that will be standardized within the
framework of MPEG-7 (the aim of MPEG-7 is to standardize, within a few years, generic ways to
describe multimedia content) will allow fast and efficient retrieval of data, on the basis of
various types of features such as text, color, texture, motion and semantic content. In this
MPEG-7 context, a mosaic can also play a useful role, as will be shown.

Sequences, video shots and key-frames follow a hierarchical structure : a video
shot is a particular sequence which shows a single background, while a key-frame is a visual
representation of this shot in a single image. A visual representation of a video sequence can
be obtained by extracting key-frames from a prior partition of the whole sequence into shots.
The process then chooses one image of each shot as key-frame, so that it only shows
a part of the video shot, which may not be the most reliable one for representation. A mosaic
therefore seems to be a better choice than a key-frame when the whole
video shot is to be shown in a single panoramic view of background information. As explained
for instance in the article "Efficient representations of video sequences and their
applications", M. Irani et al., Signal Processing : Image Communication, vol. 8, 1996,
pp. 327-351, a mosaic image is a
kind of large static image constructed from all frames in a scene sequence, giving a panoramic
view of said scene. From this panoramic view, it is then possible to extract for instance the
main features of the sequence, such as chrominance or luminance histograms, object shapes,
global motion parameters, and so on (all these features constitute relevant standard descriptors
for MPEG-7 and are useful for MPEG-7 compliant search engines).

The definition of a mosaic may be compared to that of a sprite, as used in the
context of the MPEG-4 standard. A sprite is a large static image composed of the pixels in an
object visible through an entire sequence. This static image forms a panoramic view, some
portions of which may not be visible in some frames because of foreground objects or camera
motion. If all the relevant pixels throughout the entire sequence are collected, a complete
panoramic view (called background sprite) is obtained, which can be efficiently transmitted
(or stored) and used later for re-creating portions of frames.

As described for instance in the case of a sprite in the document WO 98/59497
(but this description can be applied to the case of a mosaic), three main steps may compose a
sprite or mosaic generation (in the following, the generic term "static image" will therefore be
used in place of sprite or mosaic). A motion estimation step is first provided, in order to find
the motion parameters that allow a current frame F(n) to be merged correctly with the static
image M(n-1) already composed of the previous frames F(1), F(2),..., F(n-1). The inverse
parameters are then computed, so that the current frame may be compensated in the direction
of these inverted parameters ; this second step is also called warping. The warped current
frame F(n) is finally blended with M(n-1) in order to form a new accreted static image M(n),
with which the next incoming frame F(n+1) will be merged, and so on.
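The three steps above (estimation, warping, blending) can be sketched as a loop. The following is a minimal illustration only, not the method of WO 98/59497: it assumes purely translational motion with integer shifts, an exhaustive search as the motion estimator, and a plain running average as the blending rule. Frames and the mosaic are represented as dictionaries mapping (x, y) to a luminance value, and the names `estimate_shift`, `warp`, `blend` and `accrete` are hypothetical. The shift found here maps frame coordinates directly into mosaic coordinates, so no separate inversion of parameters is shown.

```python
def estimate_shift(frame, mosaic, search=2):
    """Step A (toy version): exhaustive search for the integer shift (dx, dy)
    that minimises the mean absolute error over the overlapping pixels."""
    best, best_err = (0, 0), float("inf")
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            errs = [abs(v - mosaic[(x + dx, y + dy)])
                    for (x, y), v in frame.items()
                    if (x + dx, y + dy) in mosaic]
            if errs and sum(errs) / len(errs) < best_err:
                best, best_err = (dx, dy), sum(errs) / len(errs)
    return best


def warp(frame, shift):
    """Step B: move the frame into the coordinate system of the mosaic."""
    dx, dy = shift
    return {(x + dx, y + dy): v for (x, y), v in frame.items()}


def blend(mosaic, weights, warped):
    """Step C (assumed rule): running average, one unit of weight per frame."""
    for p, v in warped.items():
        w = weights.get(p, 0.0)
        mosaic[p] = (w * mosaic.get(p, 0.0) + v) / (w + 1.0)
        weights[p] = w + 1.0


def accrete(frames):
    """Build M(n) from F(1)..F(n) and keep the motion parameters of each frame."""
    mosaic, weights, shifts = {}, {}, []
    for frame in frames:
        # The first frame defines the mosaic coordinate system.
        shift = estimate_shift(frame, mosaic) if mosaic else (0, 0)
        shifts.append(shift)
        blend(mosaic, weights, warp(frame, shift))
    return mosaic, shifts
```

The shifts are returned together with the mosaic because later accretion passes over the same sequence can reuse them instead of estimating the motion again.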

However, inspection of the static image thus obtained may show that
some parts of said static image are not clean. For instance, some parts of a moving object may
not have been completely removed.



SUMMARY OF THE INVENTION 

It is therefore an object of the invention to propose a method that
generates a static image without such artefacts.
To this end, the invention relates to a method such as described in the
introductory paragraph of the description and which is moreover characterized in that :

(1) said method also comprises (n-1) further accretion steps applied to the same
video sequence considered in the reverse order, each additional accretion step itself including
a warping sub-step, based on each successive video object considered in said reverse order and
on the corresponding estimated motion parameters, and a blending sub-step, provided between
the warped video object thus considered and the static image generated at the end of the
previous accretion step.

It is another object of the invention to propose a static image generation device
for carrying out said method.

To this end, this invention relates to a device for generating a large static image, 
such as a sprite or a mosaic, from a video sequence including successive video objects VOs, 
said device comprising, in view of the accretion of said static image in a first accretion stage : 

(A) a motion estimation circuit, provided for estimating a motion information
related to the relative motion between the current video object VO(n) of the sequence and the
previously generated static image ;

(B) a first warping circuit, provided for defining, on the basis of said current
video object and said motion information, a warped video object WVO(n) ;

(C) a first blending circuit, provided for blending the warped video object
WVO(n) thus obtained with said previously generated static image, said previously generated
static image being thus updated by replacement by the new one ;

characterized in that it also comprises at least a further accretion loop including a second
warping circuit, a second blending circuit provided for blending the warped video object thus
obtained with the previously generated static image, and a memory for storing said generated
static image, said memory and said second warping and blending circuits being organized for
carrying out, on the updated static image available at the output of the first blending circuit,
(n-1) additional accretion steps taking into account on the one hand the same video sequence but
considered in the reverse order and on the other hand the estimated motion information
corresponding to each concerned video object of said reverse sequence, and, between said
warping and blending circuits of said first accretion stage or of any one of said further
accretion loops, means for computing for each picture element a weighting coefficient
correlated to the error between the warped video object and the generated static image and
taken into account by the blending circuit during the blending step.

BRIEF DESCRIPTION OF THE DRAWINGS

The particularities and advantages of the invention will now be 
explained in a more detailed manner, with reference to the accompanying drawings in which : 
- Fig. 1 shows a known scheme of a static image generation device ;
- Fig. 2 shows an embodiment of a static image generation device that
implements the method according to the invention.

WO 00/08860 PCT/EP99/05513

DESCRIPTION OF THE INVENTION

A device for the implementation (for instance in the case of a mosaic)
of the method described in the document WO 98/59497 previously cited is illustrated in Fig. 1.
A motion estimation circuit 11 receives successive video objects, in the present case
successive frames F(1), F(2), F(3),..., F(i),..., F(n-1), F(n), and determines the motion
parameters that allow the incoming frame F(n) to be merged correctly with the previously
generated mosaic M(n-1) available in a memory 12. Once estimated, these parameters
are used in a warping circuit 13 that transforms the video object to the coordinate system
of the mosaic M(n-1). A blending circuit 14 finally builds the new mosaic M(n),
refreshing the old one M(n-1).

The principle of the method according to the invention is to apply the accretion step
several times, once a first mosaic such as M(n) has been built. These further accretions
begin this time at the last frame and finish with the first one, which gives less
importance to the new previous frames and leads to a cleaner static image. In these additional
accretion steps, the previously generated mosaic is now taken as a reference to build an error
map that will be useful for the following blending step.

A device for the implementation of said method is shown in Fig. 2 and comprises the following
elements. A first accretion stage 200 generates a first mosaic M1(n) according to the
scheme of Fig. 1. To this end, said mosaic generation stage 200 comprises a motion estimation
circuit, a memory, a warping circuit and a blending circuit identical to the corresponding
elements 11 to 14 and working in the same manner, and therefore not shown. At the same time,
the successive input frames F(1), F(2), F(3),..., F(i),..., F(n) are stored in a buffer 201. The
output of the buffer 201, read by beginning at the last frame (in order to finish with the first
one), is sent towards a second accretion stage 300. Said stage 300 itself comprises in series a
(second) warping circuit 331, that receives on the one hand said output of the buffer 201 and
on the other hand the corresponding motion parameters previously determined (for each of
these frames now considered in the reverse order) in the first accretion stage 200, a
pixel-based weighting circuit, and a (second) blending circuit 334.

According to the invention, an error map is constructed in a circuit 332 by comparison
between the output of the second warping circuit 331 and the output M1(n) of the first
accretion stage 200. The pixel-based weighting circuit, including said circuit 332 and a
coefficient computation circuit 333, computes for every picture element (pixel) a weighting
coefficient w_WF(n), given by the following expression (1) :

$$ w_{WF(n)}[x,y] = \frac{1}{r(x,y)} \, \frac{d}{dr}\,\rho(r(x,y)) \qquad (1) $$

where ρ is a symmetrical, positive-definite function known as the Lorentzian M-estimator and
r(x,y) is the error between the warped current image and the mosaic at the pixel (x,y). The
whole set of weighting coefficients thus computed by the pixel-based weighting circuit (332,
333) is then used by the blending circuit 334. In said circuit, a weighted mean formula taking
into account the weighting coefficients w_WF(n)[x,y] is then used to calculate the luminance and
chrominance values of a new mosaic M2(n) resulting from the blending step. The blending
formula (2) is, for each pixel [x,y] :

$$ M_2(n)[x,y] = \frac{w_{M_2(n-1)}[x,y] \, M_2(n-1)[x,y] + w_{WF(n)}[x,y] \, WF(n)[x,y]}{w_{M_2(n-1)} + w_{WF(n)}} \qquad (2) $$

where the definitions of the terms are the following :

(a) $n > 0$ ;

(b) whatever (x,y), $w_{M_2(0)} = 0$ ;

(c) whatever (x,y), $w_{WF(n)}[x,y] = \frac{1}{r(x,y)} \, \frac{d}{dr}\,\rho(r(x,y))$ ;

(d) $w_{M_2(n)} = w_{M_2(n-1)} + w_{WF(n)}$.
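Expression (1) and formula (2) can be illustrated for a single pixel as follows. This is a sketch under stated assumptions: the text names the Lorentzian M-estimator but does not give its expression or scale, so the common form ρ(r) = log(1 + r²/(2σ²)) is assumed here, with σ a hypothetical scale parameter. With that form, (1/r)·dρ/dr reduces to 2/(2σ² + r²), which stays finite at r = 0. The function names are illustrative only.

```python
import math


def lorentzian_rho(r, sigma):
    """Assumed form of the Lorentzian M-estimator: log(1 + r^2 / (2 sigma^2))."""
    return math.log(1.0 + r * r / (2.0 * sigma * sigma))


def lorentzian_weight(r, sigma):
    """Expression (1): w = (1/r) * d(rho)/dr, which for the assumed rho
    simplifies to 2 / (2 sigma^2 + r^2). Large errors get small weights."""
    return 2.0 / (2.0 * sigma * sigma + r * r)


def blend_pixel(m_prev, w_prev, wf, w_wf):
    """Formula (2) with definition (d) for one pixel.

    m_prev, w_prev: value and accumulated weight of M2(n-1) at the pixel
    wf, w_wf:       value and weight of the warped frame WF(n) at the pixel
    Returns the new mosaic value and the new accumulated weight.
    """
    w_new = w_prev + w_wf
    if w_new == 0.0:
        # Nothing has contributed to this pixel yet.
        return m_prev, 0.0
    return (w_prev * m_prev + w_wf * wf) / w_new, w_new
```

Starting from an accumulated weight of zero, as in definition (b), the first call simply copies the warped frame value into the mosaic; later calls pull the mosaic towards values whose error, and hence whose weight, marks them as reliable.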

The mosaic M2(n) thus generated at the output of the second accretion stage 300
is stored in a (second) memory 202 for refreshing the previously generated mosaic M2(n-1).
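A short derivation may help to see what the repeated refresh of M2 computes. It is a sketch for one fixed pixel (the index [x,y] is omitted), and it assumes that k simply counts the accretion steps of the second stage. Multiplying formula (2) by the accumulated weight of definition (d) and unrolling the recursion from the zero initial weight of definition (b) gives:

```latex
\begin{align*}
w_{M_2(n)} \, M_2(n) &= w_{M_2(n-1)} \, M_2(n-1) + w_{WF(n)} \, WF(n)
    && \text{from (2) and (d)} \\
 &= \sum_{k=1}^{n} w_{WF(k)} \, WF(k)
    && \text{by induction, using (b)} \\
M_2(n) &= \frac{\sum_{k=1}^{n} w_{WF(k)} \, WF(k)}{\sum_{k=1}^{n} w_{WF(k)}}
\end{align*}
```

Each pixel of M2(n) is therefore a weighted mean of all the warped frames blended so far, with the weights of expression (1) reducing the contribution of frames whose error at that pixel is large.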

This process of accretion is then iteratively reproduced for each successive
frame read from the buffer 201 (up to the first one F(1)). The motion parameters
corresponding to each of these frames are available in the stage 200 (they have been defined
during the first accretion step implemented in said stage), and the newly generated mosaic
M2(n) is stored in the memory 202, in order to be available for each following blending step
(the output of said memory 202 is received by the blending circuit 334).
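The iteration just described can be sketched as a loop over the buffered frames in reverse order. This is one possible reading, not a definitive implementation: it assumes integer translational motion parameters, the Lorentzian weight 2/(2σ² + r²) (the exact form and σ are not given in the text), an error map computed against the first mosaic M1(n), and a second mosaic that starts with zero accumulated weight as in definition (b). Images are dictionaries mapping (x, y) to a value, and `second_pass` is a hypothetical name.

```python
def second_pass(frames, shifts, first_mosaic, sigma=1.0):
    """Re-accrete the buffered frames from F(n) down to F(1).

    frames:       buffered frames, each a dict {(x, y): value}
    shifts:       motion parameters (dx, dy) estimated in the first stage
    first_mosaic: M1(n), used as the reference for the error map
    """
    mosaic, weights = {}, {}  # M2 and its accumulated weights, initially empty
    for frame, (dx, dy) in zip(reversed(frames), reversed(shifts)):
        for (x, y), value in frame.items():
            p = (x + dx, y + dy)                        # warping (circuit 331)
            r = value - first_mosaic[p]                 # error map (circuit 332)
            w_wf = 2.0 / (2.0 * sigma * sigma + r * r)  # weight (circuit 333)
            w_prev = weights.get(p, 0.0)
            # Blending (circuit 334): weighted mean of old mosaic and warped frame.
            mosaic[p] = (w_prev * mosaic.get(p, 0.0) + w_wf * value) / (w_prev + w_wf)
            weights[p] = w_prev + w_wf                  # stored for the next step
    return mosaic
```

In this sketch no motion estimation is repeated: the loop only reads the stored parameters, which mirrors the absence of a motion estimation circuit in the second accretion stage 300.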

The present invention is obviously not limited to the previous embodiment. The
blending steps carried out in the blending circuit may be preceded by preprocessing sub-steps
such as described for instance in the European patent filed on August 5, 1998, with the filing
number n° 98401997.6 (PHF98584). This document describes inter alia a mosaic generation
method incorporating an additional weighting sub-step between the warping and blending
steps. For each pixel (picture element) of the considered video object (such as a frame), a
weighting coefficient correlated to the error between the warped video object and the
previously generated mosaic is computed, and the blending formula then takes into account
each of said weighting coefficients. It is also possible to include between this additional
weighting sub-step and the blending step a spatial filtering sub-step, based for example on a
morphological segmentation and provided for converting said pixel-based weighting operation
into a region-based one, so that regions considered as outliers can be detected and discarded.
These preprocessing sub-steps may also be used before the blending step carried out in the
blending circuit of the mosaic generation stage 200.
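The conversion from pixel-based to region-based weighting can be illustrated with a deliberately simplified stand-in. The text mentions a morphological segmentation; the sketch below instead uses a binary morphological opening (3x3 erosion followed by 3x3 dilation) on a thresholded error map, which is a swapped-in, simpler technique chosen for brevity. The threshold and all function names are assumptions. Images are lists of rows.

```python
def erode(mask):
    """3x3 binary erosion; pixels outside the image count as False."""
    h, w = len(mask), len(mask[0])
    return [[all(0 <= y + j < h and 0 <= x + i < w and mask[y + j][x + i]
                 for j in (-1, 0, 1) for i in (-1, 0, 1))
             for x in range(w)] for y in range(h)]


def dilate(mask):
    """3x3 binary dilation; pixels outside the image count as False."""
    h, w = len(mask), len(mask[0])
    return [[any(0 <= y + j < h and 0 <= x + i < w and mask[y + j][x + i]
                 for j in (-1, 0, 1) for i in (-1, 0, 1))
             for x in range(w)] for y in range(h)]


def region_weights(errors, weights, threshold):
    """Zero the weights inside coherent high-error regions (treated as outliers)
    and leave the pixel-based weights unchanged everywhere else."""
    mask = [[abs(r) > threshold for r in row] for row in errors]
    regions = dilate(erode(mask))  # opening removes isolated high-error pixels
    return [[0.0 if regions[y][x] else row[x] for x in range(len(row))]
            for y, row in enumerate(weights)]
```

With this stand-in, an isolated noisy pixel keeps its (already small) pixel-based weight, while a compact block of high errors, such as a moving foreground object, is excluded from the blending altogether.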

It must also be indicated that the invention does not depend on the type of the
video sequence. In the described example, the video sequence comprises successive frames
F(1), F(2),..., F(n-1), F(n) of rectangular shape, but it is clear that it may comprise any type of
video objects, for example video objects (VOs) of any type of shape such as defined in relation
with the MPEG-4 standard according to object-oriented segmentation schemes. The term
"video object" is therefore chosen as representing here any type of video information
such as processed according to the method and device described hereinabove, and such video
objects will be designated by the references VO(1), VO(2),..., VO(n-1), VO(n).
CLAIMS:



1. A method for generating a large static image, such as a sprite or a mosaic, from 
a video sequence including successive video objects VOs, said method comprising, in view of 
the accretion of said static image, the steps of : 

(A) estimating motion parameters related to the current video object VO(n) of
the sequence, with respect to the previously generated static image ;

(B) warping said current video object VO(n), on the basis of said estimated
motion parameters ;

(C) blending the warped video object WVO(n) thus obtained with said
previously generated static image ;

characterized in that :

(1) said method also comprises (n-1) further accretion steps applied to the same
video sequence considered in the reverse order, each additional accretion step itself including
a warping sub-step, based on each successive video object considered in said reverse order and
on the corresponding estimated motion parameters, and a blending sub-step, provided between
the warped video object thus considered and the static image generated at the end of the
previous accretion step.

2. A method according to claim 1, characterized in that it also comprises,
between the warping and blending steps of the first accretion step or of any one of said further
accretion steps, an additional step for computing, for each picture element of the current video
object VO(n), a weighting coefficient w_WF(n)[x,y] correlated to the error between the warped
video object WVO(n) and the generated static image at each picture element [x,y], the blending
step provided for determining the newly generated static image then taking into account said
weighting coefficients.

3. A method according to claim 1, characterized in that it also comprises,
between the warping and blending steps of the first accretion step or of any one of said further
accretion steps, an additional step itself including :

- a first pixel-based error map definition sub-step, provided for
constructing, for each picture element of the current video object VO(n), a map of the error
r(x,y) between the warped video object WVO(n) and the static image at said picture element [x,y] ;

- a second spatial filtering sub-step, provided for transforming said
pixel-based error map into a region-based error map ;

- a third weighting sub-step, provided for computing for every pixel a
weighting coefficient w_WF(n)[x,y] correlated to said error and at the same time for selecting
regions that belong to foreground objects and discarding them as outliers before the blending
step, said blending step provided for determining the newly generated static image then taking
into account said weighting coefficients.

4. A device for generating a large static image, such as a sprite or a mosaic, from a
video sequence including successive video objects VOs, said device
comprising, in view of the accretion of said static image in a first accretion stage :

(A) a motion estimation circuit, provided for estimating a motion information
related to the relative motion between the current video object VO(n) of the sequence and the
previously generated static image ;

(B) a first warping circuit, provided for defining, on the basis of said current
video object and said motion information, a warped video object WVO(n) ;

(C) a first blending circuit, provided for blending the warped video object
WVO(n) thus obtained with said previously generated static image, said previously generated
static image being thus updated by replacement by the new one ;

characterized in that it also comprises at least a further accretion loop including a second
warping circuit, a second blending circuit provided for blending the warped video object thus
obtained with the previously generated static image, a memory for storing said generated static
image, said memory and said second warping and blending circuits being organized for
carrying out, on the updated static image available at the output of the first blending circuit,
(n-1) additional accretion steps taking into account on the one hand the same video sequence but
considered in the reverse order and on the other hand the estimated motion information
corresponding to each concerned video object of said reverse sequence, and, between said
warping and blending circuits of said first accretion stage or of any one of the further accretion
loops, means for computing for each picture element a weighting coefficient correlated to the
error between the warped video object and the generated static image and taken into account
by the following blending circuit during the blending step.